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Abstract. Wc consider deflation and augmentation techniques for accelerating the convergence 
of Krylov subspace methods for the solution of nonsingular linear algebraic systems. Despite some 
formal similarity, the two techniques are conceptually different from preconditioning. Deflation (in 
the sense the term is used here) "removes" certain parts from the operator making it singular, 
while augmentation adds a subspace to the Krylov subspace (often the one that is generated by 
the singular operator); in contrast, preconditioning changes the spectrum of the operator without 
making it singular. Deflation and augmentation have been used in a variety of methods and settings. 
Typically, deflation is combined with augmentation to compensate for the singularity of the operator, 
but both techniques can be applied separately. 

We introduce a framework of Krylov subspace methods that satisfy a Galerkin condition. It 
includes the families of orthogonal residual (OR) and minimal residual (MR) methods. We show 
that in this framework augmentation can be achieved either explicitly or, equivalently, implicitly by 
projecting the residuals appropriately and correcting the approximate solutions in a final step. We 
study conditions for a breakdown of the deflated methods, and we show several possibilities to avoid 
such breakdowns for the deflated MinRes method. Numerical experiments illustrate properties of 
different variants of deflated MinRes analyzed in this paper. 

Key words. Krylov subspace methods, augmentation, deflation, subspace recycling, CG, MIN- 
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1. Introduction. There are numerous techniques to accelerate the speed of con- 
vergence of Krylov subspace methods for solving large linear algebraic systems 

Ax = b, (1.1) 

where A £ i^nxn j g nonsingular and b 6 C^. The most widely used technique is pre- 
conditioning. Here the system (1.1) is modified using left- and/or right-multiplications 
with a nonsingular matrix (called the preconditioner) . A typical goal of precondition- 
ing is to obtain a modified matrix that is in some sense close to the identity matrix. 
For surveys of preconditioning techniques we refer to the books by Greenbaum [26, 
Part II] and Saad [4G, Chapters 9-14] and the survey of Benzi [3] . 

Here we consider two approaches for convergence acceleration that are called de- 
flation and augmentation. Let us briefly describe the main ideas of the two techniques. 
In deflation the system (1.1) is multiplied (at least implicitly) with a suitably cho- 
sen projection, and the general goal is to "eliminate" components that supposedly 
slow down convergence. Typically these are components that correspond to small 
eigenvalues. Multiplication by the projection turns the system (1.1) into a consistent 
singular one, which is then solved by a Krylov subspace method. Wc need to men- 
tion, however, that techniques have been proposed that move small eigenvalues of A 
to some large common value, say, to the value 1; see [1, 17, 31]. Some authors refer 
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to these techniques as "deflation" too. In augmentation techniques the search space 
of the Krylov subspace method, which is at the same time the Galerkin test space, is 
"enlarged" by a suitably chosen subspace. A typical goal is to add information about 
the problem to the search space that is slowly revealed in the Krylov subspace itself, 
e.g. eigenvectors corresponding to small eigenvalues. 

Deflation and augmentation techniques can be combined with conventional pre- 
conditioning techniques. Then the projection and augmentation parameters have to 
be adapted to the preconditioned matrix. In this paper, we assume that equation 
(1.1) is already in preconditioned form, i.e., A is the preconditioned matrix and b the 
preconditioned right-hand side. Details of preconditioning techniques will thus not be 
addressed here. 

We will now give a brief overview of existing deflation and augmentation strate- 
gies. For a more comprehensive presentation we refer to Section 9 of the survey article 
by Simoncini and Szyld [49]. The first deflation and augmentation techniques in the 
context of Krylov subspace methods appeared in the papers of Nicolaides [4 : ] and 
Dostal [12]. Both proposed deflated variants of the CG method [29] to accelerate the 
speed of convergence for symmetric positive definite (spd) matrices A arising from 
discretized elliptic partial differential equations. Since these early works deflation and 
augmentation have become widely used tools. Several authors working in different 
fields of numerical analysis applied them to many Krylov subspace methods, and they 
use a variety of techniques to determine a deflation subspace. A review of all appli- 
cations is well beyond this introduction. We concentrate in the following on some — 
but not all — key contributions. 

For nonsymmetric systems Morgan [36] and also Chapman and Saad [6] extracted 
approximate eigenvectors of A from the Krylov subspace generated by the GMRes 
method [47], and then they augmented the Krylov subspace with these vectors; for 
related references we refer to [22]. A comparable approach in the context of the 
CG method for spd matrices A was described by Saad, Yeung, Erhel, and Guy- 
omarc'h [48]. De Sturler [10] introduced the GCRO method, which involves an outer 
GCR iteration [15, 16] and an inner deflated GMRes method where the space used 
for deflation depends on the outer iteration. This method has been extended to 
GCROT in [11] to incorporate truncation strategies when restarts are necessary. 
In [ ] Kolotilina used a twofold deflation technique for simultaneously deflating the r 
largest and the r smallest eigenvalues by an appropriate deflating subspace of dimen- 
sion r. An analysis of acceleration strategies (including augmentation) for minimal 
residual methods was given by Saad [45] and for restarted methods by Eiermann, 
Ernst and Schneider [14]. The latter work analyzes minimal residual (MR) and or- 
thogonal residual (OR) methods in a general framework that allows approximations 
from arbitrary correction spaces. By using multiple correction spaces forming a di- 
rect sum, several cases of augmentation and deflation are discussed. The analysis 
concentrates on (nearly) A-invariant augmentation spaces. 

In [37] Morgan proposed a block-GMRES method for multiple right-hand sides 
that deflates approximated eigenvectors when GMRes is restarted. A similar method 
for solving systems with multiple shifts and multiple right-hand sides has been intro- 
duced by Darnell, Morgan and Wilcox [9]. Giraud et al. [25] recently developed a 
flexible GMRes variant with deflated restarting where the preconditioner may vary 
from one iteration to the next. In [42] Olshanskii and Simoncini studied spectral prop- 
erties of saddle point matrices preconditioned with a block-diagonal preconditioner 
and applied a deflated MinRes method to the resulting symmetric and indefinite 
matrix in order to alleviate the influence of a few small outlying eigenvalues. The- 
oretical results for deflated GMRes based on an exactly A-invariant subspace have 
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been presented in [61]. 

In addition to deflation/augmentation spaces based on approximative eigenvec- 
tors, other choices have been studied. Mansfield [34] showed how Schur complement- 
type domain decomposition methods can be seen as a series of deflations. Nico- 
laides [ 1 1] constructed a deflation technique based on piecewise constant interpolation 
from a set of r subdomains, and he pointed out that deflation might be effectively 
used with a conventional preconditioner. In [35] Mansfield used the same "subdomain 
deflation" in combination with damped Jacobi smoothing, and obtained a precon- 
ditioner that is related to the two-grid method. Baker, Jessup and Mantcuffel [2] 
proposed a GMRes method that is augmented upon restarts by approximations to 
the error. 

In [38, 39, 40] Nabben and Vuik described similarities between the deflation ap- 
proach and domain decomposition methods for arbitrary deflation spaces. This com- 
parison was extended to multigrid methods in [54, 53]. 

This brief survey indicates that in principle deflation or augmentation can be 
incorporated into every Krylov subspace method. However, some methods may suf- 
fer from mathematical shortcomings like breakdowns or numerical problems due to 
round-off errors. The main goal of this paper is not to add further examples to the 
existing collection, but to introduce first a suitable framework for a whole family of 
such augmented and deflated methods (Section 2) and then to prove some results 
just assuming this framework (Section 3). The framework focuses on Krylov subspace 
methods whose residuals satisfy a certain Galerkin condition with respect to a true or 
formal inner product. In Section 3, we mathematically characterize the equivalence of 
two approaches for realizing such methods and discuss them along with potential pit- 
falls. We then discuss known approaches to deflate CG (Section 4), GMRes (Section 
5), and MinRes (Section 6) in the light of our general equivalence theorem. Among 
other results, this will show that a recent version of deflated MinRes, which is part 
of the RMinRes ("recycling" MinRes) method suggested by Wang, de Sturler and 
Paulino [57], can break down and how these breakdowns can be avoided by cither 
adapting the right-hand side or the initial guess. We do not focus on specific imple- 
mentations or algorithmic details but on the mathematical theory of these methods. 
For the numerical application in Section 6.3 we draw on the most robust MinRes 
implementation that is available. 

2. A framework for deflated and augmented Krylov methods. In this 
section we describe a general framework for deflation and augmentation, which simul- 
taneously covers several Krylov subspace methods whose residuals satisfy a Galerkin 
condition. Given an initial guess x £ C^, a positive integer n, an n-dimensional 
subspace S n of C N , and a nonsingular matrix B £ C NxN , let us first consider an 
approximation x„ to the solution x of the form 

x n Gx +5„, (2.1) 

so that the corresponding residual 

r„ := b Ax„ E r + AS n 

satisfies 

r n J_ BS n . (2.2) 

If B H A is Hcrmitian and positive definite (Hpd) then B H A induces an inner prod- 
uct (' j ') b h a ' a corresponding norm || • ||b h Ai an d an orthogonality J-b h a- Imposing 
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equations (2.1) and (2.2) can then be seen to be equivalent to solving the following 
minimization problem: 

find x„ex +5„ s.t. ||x-x„|| b h a = min ||x-y|| B H A . (2.3) 

y6x +<S„ 

Note that due to r n = A(x — x n ) the condition (2.2) can be written as orthogonality 
condition for the error x — x„: 

(x - X„) J-B H A Sn- (2.4) 

The following two cases where B H A is Hpd are of particular interest: 

(1) B = I if A itself is Hpd; 

(2) B = A for general nonsingular A. 

The case (1) is the one where (2.2) is a typical Galerkin condition: A is Hpd and the 
residual r„ is orthogonal to the linear search space S n for x„ — xo. In (2.3) we then 
have 

||x-x n || A = IKIU-i, (2.5) 

so while the error is minimal in the A-norm, the residual is minimal in the A -1 norm. 

In this paper we will refer to (2.2) also in the case (2) as a Galerkin condition, 
because the search space and the test space are still essentially the same. However, 
in this case 

ll x - x «IIa h A = ll r ri||2, (2.6) 

so (2.2) implies that the 2-norm of the residual is minimized. Consequently, in both 
cases a minimization property holds. 

If the search space S n is the n-th Krylov subspace generated by A and the initial 
residual r := b — Ax , i.e., if 

S n = JC n (A, r ) := span {r , Ar , . . . , A n_1 r }, (2.7) 

then, in the case (1), conditions (2.1)-(2.2) mathematically characterize the CG 
method [29]. It is the prototype of an Orthogonal Residual (OR) method charac- 
terized by (2.1) and (2.2) with B = I. 

In the case (2), conditions (2.1)-(2.2) with S n = JC n (A,ro) mathematically char- 
acterize the GCR [15] and GMRes [47] methods and, for Hermitian A, the MiN- 
Res [43] method. If A is even Hpd, we can resort to Stiefel's Conjugate Residual 
(CR) method [ ]. All these are prototype Minimal Residual (MR) methods charac- 
terized by (2.1) and (2.2) with B = A. 

Orthogonal Residual and Minimal Residual methods often come in pairs defined 
by the properties of A, the Krylov search space, and, to some extent, the fundamen- 
tal structure of the algorithms. Examples of such pairs are CG/CR, GCG/GCR, 
FOM/GMRES, and CGNE/CGNR. It has been pointed out many times, see, e.g., 
[4, 8, 13, 14, 27], that the residuals of these pairs of OR/OM methods and in particular 
the residual norms are related in a simple fashion. 

A fact related to the OR/MR residual connection is that the iterates and residuals 
of an MR method can be found from those of the corresponding OR method by a 
smoothing process introduced by Schonauer; see [58, 60, 27, 28]. The reverse process 
also exists [27]. Again these processes hold for the residuals of the deflated system 
and, since they are identical, for those of the explicit augmentation approach. 
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If B H A is not Hpd, the minimization property (2.3) no longer makes sense, but 
we may still request that the orthogonality condition (2.2) or, equivalently, (2.4) 
hold. Resulting algorithms may then break down since an approximate solution x„ 
satisfying the conditions may not exist for some n. Nevertheless, such methods are 
occasionally applied in practice. In particular, the choice 

(3) B = I and A nonsingular 

covers the full orthogonalization method (FOM) of Saad [44, 46] , which is sometimes 
also referred to as Arnoldi method for linear algebraic systems. 

For minimizing the error x„ — x in the 2-norm one has to choose 

(4) B = A~ H and A nonsingular. 

Since multiplication by A~ H is not feasible, these methods only work for particular 
search spaces; the simplest choice is 

S n = A H /C„(A H ,r ). (2.8) 

Unlike the normal Krylov search space of (2.7), this one has the drawback that the 
(exact) solution of the system need not be in one of these spaces, i.e., even in ex- 
act arithmetic convergence is not guaranteed. One interesting example based on this 
choice is the Generalized Minimum Error (GMErr) method of Weiss [ ]. Earlier, for 
spd matrices, such a method was proposed by Friedman [24], and an alternative algo- 
rithm was mentioned by Fletcher [21]. Symmetric indefinite systems can be treated in 
this way with the SymmLQ algorithm of Paige and Saunders [43]; see also Freund [23] 
for a review of methods featuring this optimality criterion and yet another algorithm 
called ME to achieve it. 

Finally, we can easily incorporate the CGNR method [29] for solving overdeter- 
mined linear systems in the setting of (2.1) and (2.2) by choosing the appropriate 
Krylov search space. Given such a system 

Ex = f (2.9) 

with a full-rank M x TV-matrix E (where M > N) , the corresponding normal equations 
are E H Ex = E H f, i.e., Ax = b with A := E H E and b := E H f. Since A is Hpd, we 
can apply the CG method which corresponds to the case (1) and 

5„=/C„(E H E,E H s ) (2.10) 

with Sq := f — Exq. In this situation we have to distinguish between the residuals 
r„ := b Ax„ = E H f — E H Ex„ of the normal equations and the residuals s„ := f — Ex„ 
of the given system (2.9). The CGNR method allows one to keep track of both. The 
latter residuals satisfy 

s„ e s + E/C„(E H E,E H f), s„ i_E/C„(E H E,E H f), (2.11) 

and they can be seen to minimize the 2-norm of s n . Note that it can be viewed as an 
MR method with a possibly non-square B = E; see [27]. 

A method that also fits into our framework, though with some modifications, is 
the CGNE method, also called Craig's method [7], which can also be used for solving 
underdetermined linear algebraic systems (2.9) with a full-rank M x TV-matrix E. 
The search space for x„ in this case is (2.10), but the Galerkin condition becomes 
Sn i-/C„(EE H ,s ). 
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Since we are aiming at a general framework, let us for the moment consider an 
arbitrary, possibly singular matrix A £ C NxN and an arbitrary vector v £ C , such 
that the Krylov subspace /C n (A,v) has dimension n. 

Instead of a search space of the form S n = JC n (A,ro) we focus from now on 
augmented Krylov subspaces of the form 

S n :=K n {A,v)+U. (2.12) 

We suppose that U has dimension k, < k < N, and denote by U E i^Nxk a ma trix 
whose columns form a basis of U, and by V„ £ C 1 *" 1 one whose columns form a basis 
of /C n (A,v), so that (2.1) can be written as 

x„ = x + V n y„ + Uu„ (2.13) 

for some vectors y n <E C" and u„ £ C k . Of course, U may be redefined when an 
algorithm like GMRes is restarted, but we will not account for that in our notation. 

Assuming the general structure of the search space <S„ in (2.12) we will now 
investigate augmented Galerkin-type methods that still satisfy (2.1) and (2.2). 

3. A general equivalence theorem. Our goal in this section is to show that 
augmentation can be achieved either explicitly as in (2.12), or implicitly, namely by 
projecting the residuals appropriately and correcting the approximate solutions in a 
final step. Our main result is stated in Theorem 3.2 below. 

In order to satisfy (2.2), the residual r n = b Axo must be orthogonal to both 
B/C„(A,v) and B£Y, hence it must satisfy the pair of orthogonality conditions 

r„ _LB/C„(A,9) and r„ 1 BM. (3.1) 

Let us concentrate on the second condition of (3.1), which can be written as 

= U H B H r n = U H B H (r - AV„y n - AUu n ) = U H B H (r - AV„y n ) - E B u n , 

where 

E B := U H B H AU £ C kxk . (3.2) 

Clearly, if B H A is Hpd, then Eb is Hpd too — though in a smaller space — and thus 
nonsingular. In the following derivation, B H A need not be Hpd, but we must then 
assume that Eb is nonsingular. Then the second orthogonality condition is equivalent 
to 

u„ = Eb'I^B" (r - AV„y„) . (3.3) 
Substituting this into (2.13) gives 

x„ = x + V„y n + U (E B 1 U H B H (r - AV„y„)) 

= (I - UEb 1 U h B h A) (xo + V„y n ) + UEg 1 U H B H b, (3.4) 
r„ = r - AV„y„ - AU (Ei 1 U H B H (r - AV„y„)) 

= (I - AUE B 1 U H B H ) (r - AV„y„) . (3.5) 

To simplify the notation we define the (N x iV)-matrices 

M B := UE B 1 U H = U (U H B H AU) _1 U H , 
P B :=I AM B B H , (3.6) 
Q B :=I M b B h A. 
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Using these matrices the equations (3.4) and (3.5) take the form 

x„ = Q B (x + V„y n ) + M B B H b, (3.7) 
r„ = P B (r - AV n y„) . (3.8) 

Note that imposing the second orthogonality condition in (3.1) on the residual r„ has 
determined the vector u„, which has therefore "disappeared" in (3.7)-(3.8). We next 
state some basic properties of the matrices Pb and Qb- The proof of these properties 
is straightforward, and is therefore omitted. 

Lemma 3.1. Let A,B G C NxN and U G C Nxk be such that E B := U H B H AU 
is nonsingular (which implies that rankU = k). Then the matrices in (3.6) are well 
defined and the following statements hold: 

1. P B = Pb, PbAU = 0, and U H B H P B = 0, i.e., Pb is the projection onto 
(BU) 1 - along MA. 

Qb = Qb, QbU = 0, and U h B h AQb = 0, i.e., Qb is the projection onto 
(A H BW) ± along U. 
3. P B A = P B AQ B = AQ b . 

4- Pa = Pa.' z '- e v Pa is an orthogonal projection. 
It remains to impose the first orthogonality condition in (3.1), which will deter- 
mine the vector y„. To this end, let 

x n := x + V„y„ e x + /C„(A,v), 

so that by (3.7) x n = Qbx„ + MBB H b. Using the definition of Pb in (3.G) and 
statement 3 of Lemma 3.1, this orthogonality condition reads 

r„ = b - Ax„ = b AQ B x„ - AM B B H b = P B (b - Ax„) _L B/C„(A, v). 

We summarize these considerations in the following theorem. 

Theorem 3.2. Let the assumptions of Lemma 3.1 hold and let A G C NxN , 
V G C N and n G N be such that the Krylov subspace /C n (A,v) has dimension n. 
Furthermore, let b,xo G C N be arbitrary. 

Then, with U := im (U) and the definitions from (3.6) the following two pairs of 
conditions, 

x„ G x + /C„(A,v) + U, 

r„ := b - Ax„ i_ B/C„(A, v) + BU, 

and 

x n G x + /C n (A,v), ^ 
r n := P B (b - Ax n ) _L B/C„(A, v). 

are equivalent for n > 1 in the sense that 

x„ = Qbx„ + M B B H b, and r n = r„. (3.11) 



We call (3.9) the explicit deflation and augmentation approach because the aug- 
mentation space U is explicitly included in the search space. The equivalent conditions 
(3.10) show that the explicit inclusion of U can be omitted when instead we first con- 
struct the iterate x„ G x +/C„(A, v) so that the projected residual f„ = PB(b — Ax„) 
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satisfies the given orthogonality condition and then apply the affine correction (3.11) 
to x„, whose projected residual equals the one of x n . We call this second option the 
implicit deflation and augmentation approach. 

Note that the theorem makes no assumption on relations between A, v, and hi. 
The only assumption on the augmentation space U is that the matrix U H B H AU is 
nonsingular. (Clearly, if this holds for one basis of U it holds for all.) Moreover, 
in the theorem A and v are arbitrary except for the assumption that /C„(A,v) has 
dimension n. 

In practice, A and v should be somehow related to A, however. One specific 
choice is suggested by Theorem 3.2, in particular (3.10). If 



A := P B A, v 

then (3.10) becomes 



ro 



P B r = P B (b — Ax ) and b := P B b 



x„ e x + /C„(A,? ), 

r„ := b Ax„ _L B/C n (A,r ), 



(3.12) 



which is a formal Galerkin condition for the (consistent and singular) deflated system 
Ax = b. Based on the Jordan form of A we show in the following theorem how the 
Jordan form of A = PbA looks like when (1) U is a right invariant subspace or (2) 
is a left invariant subspace of A. 

Theorem 3.3. Suppose that the matrix A £ £NxN ^ as a partitioned Jordan 
decomposition of the form 



A = SJS- 1 = [Si S 2 ] 



Ji 
J 2 



(3.13) 



where Si,§i S C Nxk , S 2 ,§ 2 e C Nx( - N ~ k \ J ± S C kxk , and J 2 e £(N-k)x(N-k) 
Then the following assertions hold: 

(1) IfU = im(Si) ) U e C Nxk is any matrix satisfying im (U) = U anrfU H B H AU 
is nonsingular, then 



PrA 



U P B S 2 





J 2 



u P B S 2 



(3.14) 



ith [ u p B s 2 ] 1 = [ buo^bu)- 1 s 2 



(2) If BU = im (Si), U e C A,xfc is any matrix satisfying im (U) = U and 
U H B H AU is nonsingular, then 



A = P B A = [ U S 2 ] 




J 2 



[ u s 2 y 1 



(3.15) 



with [ U S 2 ] 1 = [ BU^BU)- 1 Q B S 2 



In particular, in both cases the spectrum A(A) of A is given by A(A) = {0}UA(J 2 ). 
Proof. (1) From Lemma 3.1 we can see that 

P B AU = and (BU(U H BU) _1 ) H P B A = 0. 
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By construction, there exists a nonsingular matrix R £ 
P B = I - U(U H B H U)- 1 U H B H and from §3% 
thus 



-<kxk 



with AU = UR. 



we conclude that S^U 



Hence 
and 



S^P B A = S^A = JaSl^. 



Furthermore, 

P B A(P B S 2 ) = P B AS 2 - P B AU(U H B H U)- 1 U H B H S 2 = (P B S 2 )J 2 



and the proof of (1) is complete after recognizing that 



(U H B H U)- 1 U H B H 



S H 



[ u P B S 2 ] 



I 
I 



The proof of (2) is analogous to (1). □ 

Results like the previous theorem motivate the term "deflation", which means 
"making something smaller", since the multiplication with the operator P B "removes" 
certain eigenvalues from the operator A by "moving them to zero". Special cases of 
the results shown in Theorem 3.3 have appeared in the literature: in particular, for 
spd matrices A and B = I in the works of Frank and Vuik [22] and Nabbcn and 
Vuik [38, 39], and for nonsymmetric A and B — I in the articles by Erlangga and 
Nabben [19, 20] and Yeung, Tang and Vuik [61]. 

4. Hermitian positive definite matrices and CG. This section presents 
some well-known results for deflation and augmentation techniques within the frame- 
work described in Section 2 in the case where A is Hpd. The first proposed deflated 
Krylov subspace methods for Hpd matrices are the deflated CG variants of Nico- 
laides [41] and Dostal [12]. With a full-rank matrix U S C^**, U = im (U) and 
B = I both essentially apply the CG method to the deflated system 



Ax 



where A := PiA, 



P T b. 



(4.1) 



Here, Pi = I — AU(U H AU) _1 U H is the projection onto U ± along AU as defined in 
(3.6) when B = I. Moreover, Q t = P" is then the projection onto (AU) 1 - along U. 
Note that all the matrices in (3.6) are well defined because Ei = U H AU is Hpd if 
A is Hpd. Clearly, the deflated matrix A is Hermitian but singular, since Pi is a 
nontrivial projection if < k < N. In fact, this matrix A is positive semi-definite, 
since 

v H Av = v H PiAv = v H P^Av = v h Pi(PiA)v = v h Pi(PiA) h v = v^iAPjV > 

holds for any v £ C N . The system (4.1) is consistent since it results from a left- 
multiplication of the nonsingular system Ax = b by Pi. (We note that in [41, 12] the 
application of the projection Pi to b is carried out implicitly by adapting the initial 
guess such that the initial residual is orthogonal to U = im (U).) The solution x of 
Ax = b thus also solves Ax = b, but in Eqn. (4.1) we replaced x by x to indicate the 
non-uniqueness of the solution. In fact, the general solution is x = x + h with h £ U 
since Pi Ah = AQih = if and only if Qih = 0, that is, h £ U; see statement 2 
of Lemma 3.1. The application of Qi in the final correction (3.11) will annihilate h. 
Note that a deflation version including the final correction (3.11) is used by Frank 
and Vuik [22] and Nabben and Vuik [38, 39]. 

In the context of Hpd matrices the application of the CG method to a deflated 
system like (4.1) is a commonly used technique; see, e.g., [ ] for a survey of results. 
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Finally, we point out that A as defined in (4.1) is completely determined by A and 
the choice of the space U. 

According to Nicolaides [41, Section 3] and Kaasschieter [30, Section 2], the CG 
method is well defined (in exact arithmetic) for each step n until it terminates with 
an exact solution, when it is applied to a consistent linear algebraic system with a real 
and symmetric positive semidefinite matrix. This result easily generalizes to complex 
and Hcrmitian positive semidefinite matrices. 

Mathematically, the n-th step of the CG method applied to the deflated system 
(4.1) with the initial guess Xo and the corresponding initial residual f o = b — Axo is 
characterized by the two conditions 

x„ S x +/C n (A,r ), 

r„ = b - Ax„ = P^b - Ax„) _L /C„(A,f ), 

This is nothing but the set of conditions (3.10) in Theorem 3.2 with B = I. In the 
sense of relation (3.11) these conditions have been shown to be equivalent to (3.9), 
namely 

x„ G x + /C n (A,r ) + U, 

r„ = b Ax„ _L /C„(A,r ) +U, 

which is the starting point of the theory for the deflated CG method developed 
in [48], where the authors also showed the equivalence between CG with explicit 
augmentation and CG applied to the deflated system (4.1); see Section 4 in [48], in 
particular Theorem 4.6. In a partly similar treatment, Erhel and Guyomarc'h [18] 
considered an augmented and deflated CG method where the augmentation space IA 
is itself a Krylov space. It is worth mentioning that both Saad et al. [48, Eqn. (3.12)] 
and Erhel and Guyomarc'h [18, Eqn. (3.2)] use the initial correction 

x := x_i + Mir_! with r_i := b Ax_! 

to replace a given initial approximation x_i by one with ro _L hi; in fact, it is easily 
seen that ro = Pir_i. 

The goal of deflation is to obtain a deflated matrix A whose "effective condition 
number" is smaller than the one of A, for example by "eliminating" the smallest eigen- 
values of A. A detailed analysis of spectral properties of PiA and other projection- 
type preconditioners arising from domain decomposition and multigrid methods was 
carried out in [39] and [54]. In particular, it was shown in these papers that the 
effective condition number of A is less than or equal to the condition number of A 
for any augmentation space U. Moreover, if A = A(A) is the spectrum of A and U is 
an A-invariant subspace associated with the eigenvalues = {8i, . . . , Ok} C A, then 
the effective 2-norm condition number is 

«2(A) = — 5—. 

mm AeA \ A 

In summary, for any augmentation space U, the CG method applied to the (sin- 
gular) deflated system (4.1) is well defined for any iteration step n, and it terminates 
with an exact solution x (in exact arithmetic). Once CG has terminated with a so- 
lution x of the deflated system, we can obtain the uniquely defined solution of the 
original system using the final correction step 



x = Qix + Mib 



11 



(cf. (3.11)), which indeed gives 

Ax = AQix + AMib = PiAx + AMib = (Pi + AMi)b = b. 

This computation is mathematically equivalent to an explicit use of augmentation. Of 
course, in practice we stop the CG iteration for the deflated system once the solution 
is approximated sufficiently accurately. We then use the computed approximation x„ 
and equation (3.11) from Theorem 3.2 to obtain an approximation x„ of the solution 
of the given system Ax = b. Note that, according to (3.11), the residual r„ = b — Ax„ 
of the projected system (4.1) is equal to the residual r„ = b — Ax„ of the original 
system (1.1). 

5. Non-Hermitian matrices and GMRes. In this section we present mostly 
known results on applying versions of deflated GMRes to a general nonsingular matrix 
A. We set B = A in the framework of Section 2 and discuss some choices for A, v 
and U. 

Morgan [3G] and also Chapman and Saad [(>] presented variations of GMRes that 
can be mathematically described by (3.9) with A = A and v = b Ax . Hence they 
augmented the search space with an augmentation space U but did neither deflate 
the matrix nor project the linear system onto a subspace of C N . 

Erlangga and Nabben [19] used two matrices Y, Z S C k to define the abstract 
deflation operator P Y z := I - AZ(Y H AZ)- 1 Y H for non-Hermitian matrices A. Of 
course, this choice needs the assumption of nonsingularity of Y H AZ. Requiring Y 
and Z to have full rank obviously is not sufficient. They then applied GMRes to the 
deflated linear system PyzAx = Pyzb. 

De Sturler [10] introduced the GCRO method, which is a nested Krylov subspace 
method involving an outer and an inner iteration. The outer method is the GCR 
method [15, 16], while the inner iteration uses the projection 

P A = I AM A A H = I - AU(U H A H AU)- 1 U H A H 

to apply several steps of GMRes to the projected (or deflated) linear system 

Ax = b, where A := P A A, b := P A b. (5.1) 

In GCRO the matrix U is determined from the corrections of the outer iteration. 
Clearly, the matrix E A = U H A H AU is nonsingular for any matrix U € C Nxk with 
rank U = k > 0, so that all matrices in (3.6) are well defined. Note that the projection 
P A is equal to the abstract deflation operator Pyz of Erlangga and Nabben with the 
choice Z = U and Y = AU. For the application of P A only the matrix W := AU is 
needed because P A = I — W(W H W)~ 1 W H . De Sturler further simplified this in [10, 
Section 2] to P A = I — CC H by choosing a matrix C g c^xfe w h ose columns form 
an orthonormal basis of im (AU). 

Here we concentrate on the GMRes method applied to the deflated system (5.1) 
and we first discuss some known results within the framework presented in Section 2. 
Analogously to the approach for CG described in the previous section, the deflated 
system (5.1) results from the given system Ax — b by a left-multiplication with P A 
which projects onto (AU) 1 - along AU. Note that P A is an orthogonal projection, 
since P A is Hermitian. 

If we start GMRes with an initial guess Xo and the corresponding initial residual 
?o = b— Axo = P A (b— Axq), then the iterate x„ and the residual f„ are characterized 
by the two conditions 

x„ G x + /C„(A,? ), and f „ = b Ax„ _L A/C„(A,r ). 
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If the columns of V„ form a basis of /C,*(A,ro)< then the second condition means that 
= V^A H f„ = V."A H P A r„ = V r > H P A P A (b - Ax n ) = V"A H P A (b - Ax„) 

= v> H f„, 

or, equivalently, 

r„ _L A/C„(A,f ). 

Note that here the Krylov subspace is multiplied with A instead of A and that this 
condition has precisely the form of the second condition in (3.10). Theorem 3.2 now 
implies that the mathematical characterization of GMRes applied to the deflated 
system Ax = b is equivalent to the explicit use of augmentation, i.e., the conditions 

x„ ex + /C„(A? )+W, (5-2) 

r„ = b-Ax„± A/C„(A,?o)+AW, (5.3) 

in the sense that 

x n = Q A x„ + M A A H b, and r„ = b - Ax„ = b - Ax„ = f„. (5.4) 

As mentioned in the beginning of Section 2, conditions (5. 2)- (5. 3) are equivalent to 
the minimization problem 

find x„ e x + S„ s.t. ||b — Ax n || 2 = min ||b — Ay|| 2 

y£x +5„ 

with the search space S n = /C„(A,?o) + U. In the setting of GCRO, where U is 
determined from the GCR iteration, the equivalence between GMRes applied to 
Ax — b and the minimization problem with an explicitly augmented search space has 
already been pointed out by de Sturler [10, Theorem 2.2]. The GCRO method was 
extended to an arbitrary rank-fc matrix U in [32, Section 2]. In the case where U is 
an A-invariant subspace the equivalence is straightforward and has been pointed out 
by Eiermann, Ernst and Schneider [14, Lemma 4.3]. 

Again the deflated matrix A is singular, and we have to discuss whether the 
application of GMRes to the deflated system yields (in exact arithmetic) a well- 
defined sequence of iterates that terminates with a solution. This turns out to be 
significantly more difficult than in the case of the CG method. Properties of GMRes 
applied to singular systems have been analyzed by de Sturler [10] and by Brown and 
Walker [5]. The following result is an extension of [5, Theorem 2.6]. 

Theorem 5.1. Consider an arbitrary matrix A. £ C Wxfv and a vector b G im (A) 
(i.e., the linear algebraic system Ax = b is consistent). Then the following two 
conditions are equivalent: 

1. For every initial guess xo € the GMRes method applied to the system 
Ax = b is well defined at each iteration step n and it terminates with a 
solution of the system. 

2. ker(A)nim(A) = {0}. 

Proof. It has been shown in [5, Theorem 2.6] that condition 2 implies condition 1. 
We prove the reverse by contradiction. We assume that ker (A)nim (A) ^ {0}, and we 
will construct an initial guess for which GMRes does not terminate with the solution. 
For a nonzero vector y £ ker (A) n im (A) there exists a nonzero vector y S C^, such 
that y = Ay, and since Ax = b is consistent, there exists a vector x € with 
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b = Ax. Then the initial guess xo := x — y gives ro = b Axq = b Ax + Ay = y. 
But since y G ker(A), we obtain Ar = 0, so that the GMRes method terminates 
at the first iteration with the approximation xo, for which ^ = y ^ 0. Thus, for 
this particular initial guess xo the GMRes method cannot determine the solution of 
Ax = b. □ 

The situation that the GMRes method terminates without finding the exact 
solution is often called a breakdown of GMRes. The above proof leads to the following 
characterization of all initial guesses that lead to a breakdown of GMRes at the first 
iteration. 

Corollary 5.2. Let A e C NxN and x,b e C NxN such that Ax = b. Then the 
GMRes method breaks down at the first iteration for all initial guesses 

x e Xo ■= {x - y I Ay e ker (A) \ {0}}. 

We next have a closer look at condition 2 in Theorem 5.1. If we had ker (A) = 
ker (A H ), then im (A)- 1 = ker (A H ) would imply 

{0} = im (A) 1 - n im (A) = ker (A H ) n im (A) = ker (A) n im (A), 

so that condition 2 would hold. Thus condition 2 in Theorem 5.1 is fulfilled for any 
Hcrmitian matrix A. For a general non-Hcrmitian matrix, however, it seems difficult 
to determine a deflated matrix with ker (A) = ker(A H ). However, for the deflated 
system (5.1) we can derive another condition that is equivalent with condition 2 (and 
hence condition 1) in Theorem 5.1. 

Corollary 5.3. For the deflated system (5.1), condition 2 in Theorem 5.1 is 
satisfied if and only iflA PI (AU)- 1 = {0}. In particular, the latter condition is satisfied 
when U is an exactly A-invariant subspace, i.e., when AlA = hi. 

Proof. Using the properties of the projection Pa from Lemma 3.1 and the fact 
that A is nonsingular, we obtain 

ker (A) = ker (P A A) = A _1 ker (P A ) = U, 

im (A) = im (P A A) = im (P A ) = (AU) ± . 

If AU = U, then U n (Ahi)- 1 - = {0} holds trivially. □ 

For a nonsingular matrix condition 2 in Theorem 5.1 always holds trivially, and 
hence a breakdown of GMRes can only occur if the method is applied to a linear 
algebraic system with a singular matrix (this fact has been known since the method's 
introduction in 1986 [47]). Breakdowns have also been analyzed by de Sturler [10] 
in the context of the GCRO method (see the end of Section 6.1 below for further 
comments). We want to point out that it is unlikely that a random initial guess lies 
in the subspace Xq specified in Corollary 5.2 for which the GMRes method breaks 
down in the first step. However, a general hi may lead to a breakdown. To illustrate 
the problem of breakdowns in our context, we give an example that is adapted from 
[5, Example 1.1]. 

Example 5.4. Consider a linear algebraic system with 



" 


1 " 




' 1 " 


1 





, b = 






so that the unique solution is given by the vector [0, 1] T . Let the augmentation space 
be defined by Ui = [1,0] T , then 

b = P A b = * . 
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If Xo is the zero vector, then f o = b and A?o = 0, and thus GMRes applied to 
the deflated system terminates at the very first iteration with the approximation x . 
Since Axo ^ b, this is a breakdown of GMRes. Furthermore, applying the correction 
(3.7) to x = x does not yield the solution of the original system Ax = b because 

Q A x + M A A H b = M A A H b = A H b = ^ 




Corollary 5.3 states that the GMRes method applied to the deflated system (5.1) 
cannot break down if hi is an A-invariant subspace. The following example shows that 
care has also to be taken with approximate A-invariant subspaces. 

Example 5.5. Let a > be a small positive number. Then v := [0, l,a] T is an 
eigenvector of the matrix 
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1 a" 1 
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corresponding to the eigenvalue 1. Instead of v we use the perturbed vector U2 
[0, 1, 0] T as a basis for the deflation space hi = im (U2) and obtain 
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For x, b 6 C 3 with Ax = b the GMRes method then breaks down in the first step for 
all x G {x + /3[1, 0, 0] T |/3^0}. Note that ||U 2 — v|| 2 = a can be chosen arbitrarily 
small. A better measure for the quality of an approximate invariant subspace would 
be the largest principal angle between hi and ALi. 

6. Hermitian matrices and variants of MinRes. We will now apply the 
results presented in Sections 2 and 5 to the case where A is Hermitian, nonsingular, 
and possibly indefinite. For a Hermitian matrix the GMRes method considered in 
Section 5 is mathematically equivalent to the MinRes method, which is based on the 
Hermitian Lanczos algorithm, and thus uses efficient three-term recurrences. 

6.1. The RMinRes method. This subsection discusses the "recycling Min- 
Res method", or briefly RMinRes method, developed by Wang, de Sturler and 
Paulino [57]. This method fits into the framework of Section 2, and the results pre- 
sented in Section 5 apply. Wang et al. were interested in solving sequences of linear 
algebraic systems that exhibit only small changes from one matrix in the sequence 
to the next one, and they suggested to reuse information from previous solves. The 
RMinRes method consists of two main parts that can basically be analyzed sepa- 
rately: an augmented and deflated MinRes solver which is based on GCRO and an 
extraction procedure for the augmentation and deflation data. In the second part 
Wang et al. determined harmonic Ritz vectors that correspond to harmonic Ritz val- 
ues close to zero, and used these approximate eigenspaces for augmenting the Krylov 
subspace. Here, we omit the extraction of the augmentation and deflation space and 
concentrate on the method for solving the systems. We refer to this as the solver part 
of the RMinRes method. We point out that the extracted spaces can be arbitrary 
if there are no restrictions on the changes of the matrices in the sequence of linear 
algebraic systems. However, the RMinRes method has been presented in [57] with 
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an application in topology optimization where the extracted approximated eigenvec- 
tors of one matrix are still good approximations to eigenvectors of the next matrix. 
Furthermore, we will not address the preconditioning technique outlined in [57] and 
assume that the given linear algebraic system is already in the preconditioned form. 

As in Section 5, we set B = A and consider first the resulting deflated system of 
the form (5.1), 

Ax = b, where A := P A A, b := PAb. 

If we apply MinRes to this linear algebraic system with an initial guess x and the 
corresponding initial residual f q = b — Axo = PA(b — Axo), then the iterate x„ and 
the residual ?„ are characterized by the two conditions 

x„ E x +/C„(A,r ), and r„ = b - Ax„ _L A/C„(A, r ). (6.1) 

This is essentially the approach of Kilmer and de Sturler [3 'J, Section 2]. Olshanskii 
and Simoncini [42] recently used a different approach where the MinRes method 
is applied to the deflated system PjAx = Pjb with the special initial guess x = 
U(U H AU) _1 U H b. We note that the presentation in [ ] is slightly different but the 
above can be seen with minor algebraic modifications to the relations in and preceding 
Proposition 3.1 in [42]. 

An attentive reader has certainly noticed that the deflated matrix A = PaA = 
A — AMaA 2 is in general not Hermitian, even when A is Hermitian. However, as 
pointed out in [32, 57, footnotes on p. 2153 and p. 2446, respectively], a straightfor- 
ward computation shows that 

/C„(P A A, P A v) = /C„(P A AP A , Pav) (6.2) 

holds for every vector v 6 C N because Pa is a projection. The matrix PaAPa is 
obviously Hermitian (since A and Pa arc Hermitian), and hence the Krylov subspaces 
we work with are also generated by a Hermitian matrix. It is therefore possible to 
implement a MiNRES-like method for the deflated system, which is based on three- 
term recurrences and which is characterized by the conditions (6.1). As presented in 
Section 5, these conditions combined with the correction step (5.4) are equivalent to 
the explicit use of augmentation, i.e., conditions (5.2)-(5.3). 

The latter conditions are the basis of the solver part of the RMinRes method by 
Wang, de Sturler and Paulino in [57, Section 3]. We summarize the above and give 
two mathematically equivalent characterizations of the RMinRes solver applied to 
the system Ax = b with an initial guess Xo: 

1. The original approach used in [57] incorporates explicit augmentation, which 
means to construct iterates x„ satisfying the two conditions 

x„ G x + /C„(P A A, P A r ) + U, 

r„ = b Ax„ _L A/C n (P A A,P A ro) + AU. 

2. A mathematically equivalent approach is to apply MinRes to the deflated 
system 

P A Ax = P A b (6.4) 
and correct the resulting iterates x n according to x„ = Qax„ + MaAB. 
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Note that on an algorithmic level the second approach exhibits lower computa- 
tional cost since the correction in the space U is only carried out once at the end, 
while the RMinRes solver requires one update per iteration. 

Since the solver part of RMinRes is mathematically equivalent to MinRes (and 
GMRes) applied to the deflated system, Corollary 5.3 also applies to RMinRes. 
In particular, the method can break down for specific initial guesses if (and only if) 
U n (Aid) ^ {0}. Breakdowns cannot occur if U is an exact A-invariant subspace, 
but this is an unrealistic assumption in practical applications. Note that the matrix 
A in Example 5.4 is Hermitian, thus it also serves as an example for a breakdown 
of the RMinRes solver. That the RMinRes method can break down may already 
be guessed from the fact that this method is based on the GCRO method and thus 
potentially suffers from the breakdown conditions for GCRO derived in [10]. However, 
the possibility of breakdowns has not been mentioned in [57], and in the example 
of a GCRO breakdown given in [10] the matrix A is not Hermitian. Hence this 
example cannot be used in the context of the RMinRes method, which is intended 
for Hermitian matrices. 

In the next subsection we show how to suitably modify the RMinRes approach 
to avoid breakdowns. 

6.2. Avoiding breakdowns in deflated MinRes. We have seen in Section 5 
that if ker (A) = ker (A H ), then condition 1 in Theorem 5.1 is satisfied. Consequently, 
if we can determine a Hermitian deflated matrix A and a corresponding consistent 
deflated system, MinRes applied to this system cannot break down for any initial 
guess. 

Using the projections Pa and Qa from (3.6) we decompose the solution x of 
Ax = b as 

x = P A x + (I - P A )x = P A x + AM A Ax = P A x + AM A b, (6.5) 
x = Q A x + (I Qa)x = Qax + M A A 2 x = Q A x + M A Ab. (6.6) 

Using (6.6), the system Ax = b becomes A(QAX + MAAb) = b. With the definition 
of Pa and AQa = PaA (cf. Lemma 3.1) we see that this is equivalent to 

P A Ax = P A b. 

We now substitute for x from (6.5) and obtain PaA(Pax + AMAb) = PAb which 
is equivalent to 

P A AP A x = P A Q A b. (6.7) 

We can show the following result for the MinRes method applied to this sym- 
metric system. 

Theorem 6.1. For each initial guess x G C N the MinRes method applied to 
the system (6.7) yields (in exact arithmetic) a well-defined iterate x„ at every step 
n > 1 until it terminates with a solution. Moreover, the sequence of iterates 

x„ := Q A (Pax„ + AM A b) + M A Ab (6.8) 

is well defined. It terminates (in exact arithmetic) with the exact solution x of the 
original linear system Ax = b, and its residuals are given by r n = b — Ax„ = 
P A Q A b P A AP A x„. 

Proof. The first part follows from the fact that the system (6.7) is a consistent 
system with a Hermitian matrix PaAPa, so that we can apply Theorem 5.1. It 



17 



remains to show the second part. The n-th residual of the original system Ax = b is 
given by 

r„ = b - Ax„ = b A (Q A (P A x„ + AM A b) + M A Ab) 
= b AQ A (P a x„ + AM A b) AM A Ab 
= (I - AM A A) b - P A A (P A x„ + AM A b) 
= P A b P A AP A x„ P A A 2 M A b 
= P A (I A 2 M A ) b P A AP A x„ 
= P A Q A b -P A AP A x,, 

We see that r„ is equal to the n-th MinRes residual for the system (6.7). In particular, 
this implies that the exact solution of (1.1) is given by (6.8) once an exact solution 
x ra of (6.7) has been determined by MinRes. □ 

When MinRes is applied to the deflated system (6.7), the Hermitian iteration 
matrix P A AP A can again be replaced by the non-Hermitian matrix P A A (cf. Sec- 
tion 6.1). 

The following theorem shows that a modification of the initial guess suffices to 
make the solver part of the RMinRes method mathematically equivalent to MinRes 
applied to the system (6.7). 

Theorem 6.2. We consider the following two approaches: 

1. The solver part of the RMinRes method applied to Ax = b with the initial 
guess x := Pa x o + AM A b and resulting iterates x n and residuals r„ = 
b Ax„ . 

2. The MinRes method applied to (6.7) with the initial guess xo and resulting 
iterates x„ and residuals r„ := P A Q A b — P A AP A x„. 

Both approaches are equivalent in the sense that x„ = Q A (P A x„ + AM A b) + M A Ab 
and r„ = r„. 

Proof. Let us start with the MinRes method applied to (6.7), which constructs 
iterates x„ = xo + V„y„, where V„ £ tr^Nxn j g Q f f u ]j ran k n suc h that im (V„) = 
K„(PaAPa, PQ H r ). Then P A V„ = V„ and the corrected iterates are 

x„ = Q A (P A (x + V„y„) + AM A b) + M A Ab = Q A (x + V„y„) + M A Ab 
= Q A x„ + M A Ab, (6.9) 

with x„ :— Xo + V„y n . For n > the n-th residual of x n with respect to the system 
(6.7) is 

r„ = P A Q A b P A AP A x„ = P A (Q A b AP A x AV„y„) 

= P A (b - A(P A x + AM A b + V n y n )) = P A b - P A Ax„ =: ?„. 

This is the residual of x„ with respect to the system (6.4). We also have 

? = P A b - P A Ax = P A Q A b P A AP A x = r , 

and thus the starting vectors of the Krylov subspace for both methods are equal. 
Because of (6.2) also the Krylov subspaces are equal. From the definition of the 
Krylov subspaces we immediately obtain 

r„ J_ P A AP A /C„(P A AP A ,r ) r n J_ P A A/C„(P A A,f ). 

We can now see that the iterates x ra are the iterates of MinRes applied to (6.4) with 
the initial guess xo- Along with the correction (6.9) this was shown to be equivalent 
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to the RMinRes solver applied to Ax = b with the initial guess xo (cf. Section 6.1). 
□ 

This means that (in exact arithmetic) breakdowns in the solver part of the RMin- 
Res method can be prevented by either adapting the right-hand side to Q H b and 
correcting the approximate solution at the end according to Theorem 6.1 or by choos- 
ing the adapted initial guess Xo defined in Theorem 6.2. Both choices do not increase 
the computational cost significantly since these computations only need to be carried 
out once. A similar special initial guess has also been used in [54] to obtain a robust 
deflation-based preconditioner for the CG method; compare the A-DEF2 method 
in [ , Table 2]. 

6.3. Numerical experiments. In this subsection, we will show the numerical 
behavior of selected Krylov subspace methods discussed above. Detailed numerical 
experiments with the deflated CG method (cf. Section 4) and equivalent approaches 
have been presented in [54]. Here, we will focus on the solver part of the RMinRes 
method and the deflated MinRes method in order to numerically illustrate the phe- 
nomenon of breakdowns that have only been described theoretically so far (cf. Sec- 
tions 6.1 and 6.2). Both methods are implemented in MATLAB with three-term 
Lanczos recurrences and Givens rotations for solving the least squares problem. All 
residuals have been computed explicitly in each iteration. 

Example 6.3. In this example we use a matrix A = W H DW e f 2mx2m , 
m = 50, where D = diag(Ai, . . . , \2m) with Xj — i/J, \ m +j = —yfj for j = 1, . . . , m 
and W = [wi , . . . , w 2m ] is a randomly generated orthogonal matrix. We consider a 
matrix U = [u%, . . . ,Uk] whose columns are pairwise orthogonal eigenvectors of A, 
i.e. AU = UDu and U H U = with a diagonal matrix Du = diag(A J1 , . . . , Xj k ) 
for < j\ < ■ ■ ■ < jk < 2m. This means that U = im (U) is an exact A-invariant 
subspace. Then a straightforward computation reveals that Pa = Qa = I UU H , 
which is obviously Hcrmitian, and 

P A AP A = PaAQa = P A A = P A A, P a Q a = Pa = Pa 

By comparing the correction steps of RMinRes and deflated MinRes (cf. Sections 6.1 
and 6.2) and using PaAMa = 0, we can see that both methods are mathematically 
equivalent if hi is an exact invariant subspace. 

We solve the system Ax = b with a random right-hand side b and the initial 
guess xo = 0. In Figure 6.1 we show the relative residual norms of the solvers 

• MinRes (solid line), 

• RMinRes with explicit augmentation and deflation (dotted line) according 
to Wang efc si. [57]; cf. (6.3), 

• RMinRes with deflation only (dash-dotted line), i.e., the residual norms of 
MinRes applied to the system Pa Ax = PAb; cf. (6.4), 

• deflated MinRes (dashed line), i.e., the residual norms of MinRes applied 
to the system PaAPax = PaQaI 3 ; c f- Section 6.2. 

For the last three methods we used the matrix U = [wi, . . . , W5, W51, . . . , W55] which 
contains the eigenvectors associated with the 10 eigenvalues of A of smallest absolute 
value. Thus the deflation space hi has dimension 10. We have shown above that the 
two implementations of RMinRes and the deflated MinRes method are mathemat- 
ically equivalent, and in this example the three convergence curves corresponding to 
these methods indeed coincide; see Figure 6.1. 

Example 6.4. We now investigate breakdowns and near-breakdowns of the 
RMinRes method using a set of artificially constructed examples. Of course, the 
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Fig. 6.1. Convergence history for Example 6.3. The convergence curves of both RMinRes 
solver implementations and the deflated MinRes method coincide. 

occurrence of an exact breakdown as in the following examples will be rare in practi- 
cal applications. 

For our construction we use the same matrix A as in Example 6.3 and we construct 
a subspace U for which U n (AU) ^ {0}. Thus, the condition that guarantees a 
breakdown-free RMinRes computation is violated; cf. Section 6.1. To construct the 
subspace U we choose an integer k, < k < m, and we define Wi = [w^, . . . , WjJ 
and W2 = [w ra+ ,; 1 ,...,w m+ ij for indices < i\ < ■ ■ ■ < ik < m. With Du = 
diag(Aj i; • • • , Aj fc ) we obtain AW[ = \V\Du and AW 2 = — W 2 Du because of the 
symmetry of the spectrum of A. We now choose the matrix U = W1+W2. Applying 
A yields AU = (Wi — W 2 )Du and using the fact that W is unitary shows that 
U H AU = 0, or equivalently U C (AU)- 1 . The proof of Theorem 5.1 gives us a way 
to construct an initial guess which leads to an immediate breakdown of RMinRes. 
For an arbitrary 7^ u 6 U we choose x = A _1 (b — u). Because of U _L AU we 
have Pau = u and the initial residual of RMinRes is r = PAb — PaAx = u. The 
breakdown then occurs in the first iteration because PaAi-0 = PaAu = since Au € 
AU = ker(PA)- For these constructed initial guesses the RMinRes method indeed 
breaks down immediately in numerical experiments, whereas the deflated MinRes 
method finds the solution after one step. There is no need to plot these results. 

Of greater interest are situations with perturbed data. Interestingly, randomly 
perturbed initial guesses lead to a breakdown of RMinRes with the previously con- 
structed deflation space as well. In Figure 6.2(a) we show the relative residual norms 
of the solvers listed above applied to the same A and b as in the previous example 
and with the matrix U^ 1 ' = [wi + w 5 i, . . . ,w w + w 60 ]. 

Furthermore, breakdowns also occur when we perturb the deflation space. Fig- 
ure 6.2(b) shows the results for a perturbed matrix U^ 2 - 1 = UW + E with a random 
E e C 100xl ° and ||E|| 2 = 1CT 10 . The used initial guess is the same perturbed initial 
guess as in the experiment conducted for Figure 6.2(a). 

Note that both RMinRes implementations suffer from a breakdown after a few 
steps with both matrices and U^ 2 ). With the unperturbed matrix U^ 1 ) the 

deflated MinRes method converges to the solution with a relative residual smaller 
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(a) Unperturbed deflation space UW 
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(b) Perturbed deflation space U (2) = UW + E 

Fig. 6.2. Convergence history for Example 6-4- The convergence curves of both RMinRes 
solver implementations coincide. 

than 10 -12 , while in the case of the perturbed matrix U^ 2 ) the method stagnates 
with a relative residual of order 10~ n . This stagnation of deflated MinRes seems to 
be related to an unfavorable spectrum of PaAPa for these specifically constructed 
and perturbed matrices like U^ 2 ). It is unlikely that the stagnation is caused by 
roundoff errors because the stagnation also occurs (up to iteration 100) when full 
recurrences (GMRes) are used instead of short recurrences (MinRes). Perturbing 
the matrix U from Example 6.3 whose columns are exact eigenvectors of A does not 
cause stagnation. This behavior is still subject to further research. 

Note that the construction of U = im (U) in Example 6.4 is such that AU _L U 
which cannot be achieved with a (nearly) A-invariant subspace if A is Hermitian. 
In [57] an approximation to an invariant subspace of a previous matrix in a sequence 
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of linear algebraic systems is used. In this situation care has to be taken that the 
extracted space is still a good approximation to an invariant subspace of the current 
matrix. However, in the experiments of [57, Section 7] this seems to be fulfilled since 
stagnation has not been observed. 

7. Conclusions. In this paper we first analyzed theoretically the link between 
basic theoretical properties of deflated and augmented Krylov subspace methods 
whose residuals satisfy a Galerkin condition, including the minimum residual methods 
whose inclusion into the class of Galerkin methods requires a replacement of the stan- 
dard inner product. We proved that augmentation can be achieved without explicitly 
augmenting the Krylov subspace, but instead projecting the residuals appropriately 
and using a correction formula for the approximate solutions. We discussed this result 
in detail for the CG method and GMRes/MinRes methods, the main representa- 
tives of our class. It turned out that for these methods some of our results had been 
mentioned before in the literature. 

The projections which arise from the augmentation can also be used to obtain a 
deflated system. We have seen that a left-multiplication of the original system with 
the corresponding projection yields a deflated system for which the CG method and 
GMRes/MinRes methods implicitly achieve augmentation. We proved that for non- 
singular Hermitian matrices the MinRes method for the deflated system is equivalent 
to the solver part of the RMinRes method introduced in [57]. While CG never breaks 
down, GMRes, MinRes and thus RMinRes may suffer from breakdowns when used 
with the deflated systems. We stated necessary and sufficient conditions to charac- 
terize breakdowns of these minimal residual methods. For Hermitian matrices, we 
introduced the deflated MinRes method which also uses a Hermitian deflated matrix 
and proved that it cannot break down. These results were illustrated numerically. 

Our framework covers methods based on a specific type of Galerkin condition; see 
(2.1)-(2.2). It does not include methods based on other conditions, in particular those 
that in practical methods are realized using the non-Hermitian Lanczos algorithms. 
Examples for such methods are BiCG [21] and its variants including CGS [50], BiCG- 
Stab [55], and IDR(s) [51]. Extending our framework to such methods remains a 
subject of further work. 

Moreover, in this paper we did not discuss or recommend practical choices of 
deflation and/or augmentation spaces. Finding spaces that lead to an improved con- 
vergence behavior of the deflated or augmented method is a highly challenging task 
that should be attacked with a specific application in mind. Similar to precondition- 
ing, there exists no single-best strategy for choosing deflation or augmentation spaces 
in practice. Often one deflates (approximations of) eigenvectors corresponding to the 
smallest eigenvalues of the given matrix. For symmetric or Hermitian positive defi- 
nite matrices, this strategy can be shown to reduce the "effective condition number", 
which in turn leads to improved convergence bounds, and actually faster convergence 
of the iterative solver; see e.g. [56]. For non-symmetric or non- Hermitian matrices, 
however, the question of effective choices of deflation or augmentation spaces is largely 
open. 

Acknowledgements. The authors wish to thank the anonymous referees for 
their comments which helped to improve the presentation. 
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